feat: Improve action unadvertising#1248
Conversation
|
@Mergifyio backport kilted jazzy humble |
✅ Backports have been createdDetails
Cherry-pick of cfd0646 has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Cherry-pick of cfd0646 has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Cherry-pick of cfd0646 has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally |
…lities Both `services.call_service` and `advertise_action`'s `AdvertisedActionHandler` create rclpy entities (`rclpy.Client` and `rclpy.action.ActionServer`, respectively) on the rosbridge node directly from a worker thread. The same node is concurrently spun by the rclpy executor, so the executor's wait-set rebuild can race with the worker thread's entity registration / destruction and either crash inside `_rclpy` (action servers SIGSEGV in `rclpy/action/server.py:__init__`) or leave the executor holding a handle whose underlying PyCapsule has been freed (services surface this later as `TypeError: Object of type 'NoneType' is not an instance of 'capsule'`). This is the service- and action-capability analogue of the IncomingQueue race fixed in RobotWebTools#1183: route the lifecycle calls through the executor via `executor.create_task(...)` so they serialize with the executor's own work instead of competing with it. - Extract the executor-routing pattern into `rosbridge_library.internal.executor_helpers`, exposing `run_on_executor` (synchronous; returns the value, used when the caller needs the entity back) and `schedule_on_executor` (fire-and-forget, used for destructors). Both fall back to inline invocation when the node has no attached executor, preserving unit-test behaviour. - Apply `run_on_executor` to client creation in `services.call_service` and ActionServer creation in `AdvertisedActionHandler.__init__`. The three `destroy_client` sites in `services.call_service` now use `schedule_on_executor`. ActionServer destruction was already routed through the executor in RobotWebTools#1248. - `ActionTester` in `test_actions.py` constructs its own ActionServer from the test thread while the fixture's executor is already spinning. Route that creation through `run_on_executor` for the same reason; without it the rclpy SIGSEGV reproduces on a workspace with up-to-date rclpy. `test_stress_service_clients.py` exercises the service race window (parallel callers, barrier-synchronized starts, rapid sequential cycles, and interleaved waves of concurrent callers) and asserts the executor remains healthy and responsive afterwards.
…lities Both `services.call_service` and `advertise_action`'s `AdvertisedActionHandler` create rclpy entities (`rclpy.Client` and `rclpy.action.ActionServer`, respectively) on the rosbridge node directly from a worker thread. The same node is concurrently spun by the rclpy executor, so the executor's wait-set rebuild can race with the worker thread's entity registration / destruction and either crash inside `_rclpy` (action servers SIGSEGV in `rclpy/action/server.py:__init__`) or leave the executor holding a handle whose underlying PyCapsule has been freed (services surface this later as `TypeError: Object of type 'NoneType' is not an instance of 'capsule'`). This is the service- and action-capability analogue of the IncomingQueue race fixed in RobotWebTools#1183: route the lifecycle calls through the executor via `executor.create_task(...)` so they serialize with the executor's own work instead of competing with it. - Extract the executor-routing pattern into `rosbridge_library.internal.executor_helpers`, exposing `run_on_executor` (synchronous; returns the value, used when the caller needs the entity back) and `schedule_on_executor` (fire-and-forget, used for destructors). Both fall back to inline invocation when the node has no attached executor, preserving unit-test behaviour. - Apply `run_on_executor` to client creation in `services.call_service` and ActionServer creation in `AdvertisedActionHandler.__init__`. The three `destroy_client` sites in `services.call_service` now use `schedule_on_executor`. ActionServer destruction was already routed through the executor in RobotWebTools#1248. - `ActionTester` in `test_actions.py` constructs its own ActionServer from the test thread while the fixture's executor is already spinning. Route that creation through `run_on_executor` for the same reason; without it the rclpy SIGSEGV reproduces on a workspace with up-to-date rclpy. `test_stress_service_clients.py` exercises the service race window (parallel callers, barrier-synchronized starts, rapid sequential cycles, and interleaved waves of concurrent callers) and asserts the executor remains healthy and responsive afterwards.
Public API Changes
None
Description
Previously, action servers were never actually destroyed and just left hanging. This PR improves the logic so that now, when an action gets unadvertised:
The test confirms that the user can advertise the action, unadvertise the action and then advertise the action with the same name again.
I'm still not sure whether
action_server.destroy()is perfectly safe but I didn't experience any crashes when testing it.